-
Notifications
You must be signed in to change notification settings - Fork 45
Support Bundles: Chunked upload to Sled Agent #8559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
80b04a6
to
cbc04e3
Compare
The doc comments in the OpenAPI definition are already clean in this regard, but I want to caution you against using the word "multi-part" in anything user-facing because it already has a specific meaning in the context of file uploads: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Methods/POST#multipart_form_submission |
If it's any consolation, this is an internal API. I saw the multipart form submission interface, and candidly skipped it for simplicity (I don't want to deal with "delimiters" in support bundles; I just want to make requests at offsets). EDIT: I'll change the PR name; this isn't really a term used in the change anyway |
@@ -494,8 +494,12 @@ impl BundleCollection { | |||
|
|||
// Create the zipfile as a temporary file | |||
let mut zipfile = tokio::fs::File::from_std(bundle_to_zipfile(&dir)?); | |||
let total_len = zipfile.metadata().await?.len(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way John pointed this out on one of my previous PRs but you can SeekFrom::End(0) (since we later seek back to the beginning on line 503) to get the length without having to do an extra stat call. Either way works, just thought I would mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will end up being roughly equivalent; it's a "stat now, seek later" or "seek now, seek later", right? Either way, we need a syscall to get the length
cbc04e3
to
52140ad
Compare
52140ad
to
84be721
Compare
First time looking at support bundle code and I'll give it a better read through, but is the ultimate idea that we are using disk storage as our tracking of the state of an upload? If we never call finalize, then the transferred data would just sit there correct? |
This is correct; the transferred data would exist on the sled agent with the name I think this is okay:
The question of "when and how do we release this storage" is up to Nexus: The background task only considers the bundle "deleted" if the sled agent can affirmatively say: "this bundle has been deleted, or no longer exists". This could happen before "finalize" or after "finalize" -- either way, we don't mark the bundle deleted in the DB until either (1) the sled responds affirmatively, or (2) the disk being used by the sled agent has been expunged |
Awesome, thanks for the explanation! That makes sense to me. |
Support bundle collection in Nexus previously transferred the entire bundle as a streamed file from Nexus -> Sled Agent. Unfortunately, dropshot has upper bounds for streaming file sizes, and as we hit the 2 GiB boundary, we noticed that collection started to fail during transfers.
This PR modifies how Nexus transfers bundles to Sled Agents: it uses an incremental upload scheme, transferring
up to 1 GiB at a time, at a requested offset. This should allow for incremental transfer of files which are much larger than 2 GiB, and makes it easier for us to "re-do" failed operations later.
(Changing behavior on partial failure would be a welcome change, but is out-of-scope for this PR, which tries to only modify the behavior from a "oneshot" upload to a "start, transfer, finalize" API).
Added tests for:
Fixes #8556